This replication assignment is based off of the following publication:
Jensen, S. A., Mundry, R., Nunn, C. L., Boesch, C., & Leendertz, F. H. (2009). Non-invasive body temperature measurement of wild chimpanzees using fecal temperature decline. Journal of wildlife diseases, 45(2), 542-546.
This paper attempts to establish a non-invasive method for measuring body temperature in a wild population of chimpanzees, since body temperature can be an indicator of overall health. Individuals with a high pathogen load or disease will likely experience a notable increase or decrease in body temperature similar to humans.
This non-invasive method involves measuring fecal temperature immediately after defecation. Fecal temperature cannot be used as a direct measurement of body temperature though. Thus, Jensen et al. (2009) measured rectal temperature followed by a series of fecal temperatures in human participants. From rectal and fecal temperatures, the sigmoidal curve of the fecal temperatures was compared to the rectal temperature to establish the difference between the two sets of measurements.
To demonstrate the nature of the data, the authors included a figure (Figure 1) of an example of one of the human samples. While I did not select the same sample, I will also be graphing a human sample
# Bringing the data in:
example_curve<-read.table(file="Figure1Data.txt",header=T, sep="\t")
example_curve
## Time Time_C Temperature
## 1 31.0 0.0 37.3
## 2 33.5 2.5 36.6
## 3 34.1 3.1 36.6
## 4 34.4 3.4 36.5
## 5 35.0 4.0 36.5
## 6 35.3 4.3 36.4
## 7 36.0 5.0 36.3
## 8 36.3 5.3 36.2
## 9 37.0 6.0 36.1
## 10 37.3 6.3 36.0
## 11 37.5 6.5 35.8
## 12 38.3 7.3 35.7
## 13 38.5 7.5 35.7
## 14 39.2 8.2 35.6
## 15 39.4 8.4 35.5
## 16 40.0 9.0 35.5
## 17 40.3 9.3 35.4
## 18 40.5 9.5 35.3
## 19 41.1 10.1 35.2
## 20 41.3 10.3 35.2
## 21 41.5 10.5 35.2
## 22 42.1 11.1 35.1
## 23 42.3 11.3 35.1
## 24 42.5 11.5 35.0
## 25 43.1 12.1 35.0
## 26 43.3 12.3 35.0
## 27 43.5 12.5 34.9
## 28 44.1 13.1 34.9
## 29 44.3 13.3 34.8
## 30 44.5 13.5 34.8
## 31 45.1 14.1 34.8
## 32 45.3 14.3 34.7
## 33 45.5 14.5 34.7
## 34 46.1 15.1 34.7
## 35 46.3 15.3 34.6
## 36 46.5 15.5 34.6
## 37 47.1 16.1 34.6
## 38 47.3 16.3 34.5
## 39 48.0 17.0 34.5
## 40 48.2 17.2 34.4
## 41 48.4 17.4 34.4
## 42 49.0 18.0 34.3
## 43 49.2 18.2 34.3
## 44 49.4 18.4 34.3
## 45 50.0 19.0 34.2
## 46 50.2 19.2 34.2
## 47 50.4 19.4 34.1
## 48 51.0 20.0 34.1
## 49 51.3 20.3 34.0
## 50 51.5 20.5 34.0
## 51 52.1 21.1 34.0
## 52 52.3 21.3 34.0
## 53 53.1 22.1 33.9
## 54 53.3 22.3 33.9
## 55 53.5 22.5 34.0
## 56 53.5 22.5 33.9
## 57 54.1 23.1 33.8
## 58 54.3 23.3 33.8
## 59 54.5 23.5 33.8
## 60 55.1 24.1 33.7
## 61 55.3 24.3 33.6
## 62 55.5 24.5 33.5
## 63 56.1 25.1 33.5
## 64 56.3 25.3 33.5
## 65 56.5 25.5 33.4
## 66 57.1 26.1 33.4
## 67 57.3 26.3 33.4
## 68 57.5 26.5 33.4
## 69 58.1 27.1 33.4
## 70 58.3 27.3 33.3
## 71 58.5 27.5 33.3
## 72 59.0 28.0 33.3
## 73 59.2 28.2 33.3
## 74 59.4 28.4 33.3
## 75 60.0 29.0 33.2
## 76 60.2 29.2 33.2
## 77 60.4 29.4 33.2
## 78 61.1 30.1 33.2
## 79 61.3 30.3 33.2
## 80 61.5 30.5 33.1
## 81 62.1 31.1 33.1
## 82 62.3 31.3 33.1
## 83 62.4 31.4 33.1
## 84 63.0 32.0 33.0
In this data, Time is given based off of a number shown on a voice recording, Time_C is a corrected time, and Temperature is the fecal temperature provided in degrees Celcius.
library(ggplot2)
p <- ggplot(example_curve, aes(Time_C, Temperature))
p <- p + geom_point()+ labs(xlab("Time after Defecation (Minutes)")) + labs(ylab("Temperature (Celcius)"))
p
This is the basic trend that we expect for all of the data. As Time after Defecation increases, we would expect the Temperature of the fecal sample to decrease. In samples where there is a short rise followed by a decrease in temperature, the points prior to the maximum temperature were discarded. These would be readings likely taken when temperature was adjusting and not accurately recording temperature of the sample.
The equation to calculate the sigmoid curve for each temperature reading was determined from rectal and fecal temperature readings:
where T is temperature (at a given time), t is time since defecation, and a, b, c, and d are coeficcients that describe the shape of the sigmoid curve.
The following chunk of code allows us to bring in multiple text files that are all named using a particular coding system. In this case, all text files are named ’chimp*’ where the asterisks would be the chimp ID number.
# read txt files with names of the form Patient*.txt
txt_files <- list.files(path= ".", pattern = "chimp*", all.files= TRUE)
txt_files
## [1] "chimp1.txt" "chimp2.txt" "chimp3.txt"
# read txt files into a list (assuming separator is a comma)
xdata <- lapply(txt_files, read.table, header=T, sep = "\t")
xdata
## [[1]]
## time temperature
## 1 53.17 35.6
## 2 53.83 35.6
## 3 54.50 35.6
## 4 54.92 35.6
## 5 55.33 35.6
## 6 55.83 35.6
## 7 56.35 35.6
## 8 56.78 35.5
## 9 57.33 35.5
## 10 57.83 35.5
## 11 58.22 35.4
## 12 58.67 35.4
## 13 59.10 35.3
## 14 59.57 35.3
## 15 60.05 35.2
## 16 60.50 35.1
## 17 61.00 35.1
## 18 61.43 35.0
## 19 61.92 34.9
## 20 62.35 34.8
## 21 62.78 34.7
## 22 63.23 34.7
## 23 63.70 34.6
## 24 64.17 34.5
## 25 64.63 34.4
## 26 65.07 34.4
## 27 65.50 34.3
## 28 65.95 34.2
## 29 66.38 34.1
## 30 66.83 34.0
## 31 67.27 33.9
## 32 67.75 33.9
## 33 68.22 33.8
## 34 68.72 33.7
## 35 69.15 33.6
## 36 69.58 33.5
## 37 70.05 33.5
## 38 70.50 33.4
## 39 70.97 33.4
## 40 71.42 33.3
## 41 71.87 33.3
## 42 72.33 33.2
## 43 72.80 33.2
## 44 73.27 33.1
## 45 73.70 33.1
## 46 74.15 33.0
## 47 74.62 33.0
## 48 75.08 32.9
## 49 75.53 32.9
## 50 76.00 32.9
## 51 76.43 32.8
## 52 76.92 32.7
## 53 77.52 32.7
## 54 78.00 32.6
## 55 78.42 32.6
## 56 78.87 32.6
## 57 79.32 32.5
## 58 79.75 32.5
## 59 80.20 32.4
## 60 80.67 32.4
## 61 81.08 32.4
## 62 81.50 32.3
## 63 81.95 32.3
## 64 82.38 32.3
## 65 82.85 32.2
## 66 83.37 32.3
## 67 83.75 32.2
## 68 84.17 32.2
## 69 84.53 32.1
## 70 85.08 32.1
## 71 85.50 32.1
## 72 85.98 32.1
## 73 86.42 32.0
##
## [[2]]
## time temperature
## 1 10.67 36.0
## 2 11.17 35.9
## 3 11.50 35.8
## 4 11.80 35.7
## 5 12.08 35.6
## 6 12.50 35.5
## 7 13.08 35.4
## 8 13.67 35.3
## 9 14.23 35.2
## 10 14.83 35.1
## 11 15.33 35.1
## 12 16.00 34.9
## 13 16.83 34.7
## 14 18.17 34.2
## 15 18.75 34.0
## 16 19.50 33.9
## 17 20.00 33.7
## 18 21.00 33.7
## 19 21.83 33.5
## 20 22.67 33.1
## 21 23.67 32.7
## 22 24.17 32.3
## 23 25.00 32.3
## 24 25.67 32.3
## 25 26.33 32.2
## 26 26.67 32.0
##
## [[3]]
## time temperature
## 1 9.08 37.5
## 2 9.25 37.2
## 3 9.55 37.2
## 4 9.73 37.3
## 5 9.83 37.3
## 6 9.95 37.3
## 7 10.08 37.4
## 8 10.40 37.4
## 9 10.47 37.4
## 10 10.73 37.4
## 11 10.85 37.4
## 12 10.98 37.4
## 13 11.17 37.4
## 14 11.33 37.4
## 15 11.45 37.4
## 16 11.55 37.4
## 17 11.72 37.4
## 18 11.82 37.4
## 19 11.93 37.4
## 20 12.02 37.3
## 21 12.12 37.3
## 22 12.22 37.3
## 23 12.33 37.3
## 24 12.45 37.3
## 25 12.58 37.3
## 26 12.70 37.2
## 27 12.88 37.2
## 28 12.98 37.2
## 29 13.07 37.2
## 30 13.17 37.2
## 31 13.27 37.2
## 32 13.38 37.1
## 33 13.57 37.1
## 34 13.68 37.1
## 35 13.80 37.1
## 36 13.88 37.0
## 37 14.00 37.0
## 38 14.08 37.0
## 39 14.18 37.0
## 40 14.30 36.9
## 41 14.42 36.9
## 42 14.48 36.9
## 43 14.58 36.9
## 44 14.67 36.9
## 45 14.73 36.9
## 46 14.90 36.9
## 47 15.00 36.9
Using the following code, we can then calculate the four coeficcients (a,b,c,d) for each sample using the following code:
# read txt files with names of the form Patient*.txt
txt_files <- list.files(path= ".", pattern = "chimp*", all.files= TRUE)
# read txt files into a list (assuming separator is a comma)
xdata <- lapply(txt_files, read.table, header=T, sep = "\t")
# Get order of values in time (although all the data is already ordered according to time)
for (j in 1:3) {
o<-order(xdata[[j]]$time)
xdata[[j]]<-xdata[[j]][o,]
# Calculate the number of entries in data
nmeas<-nrow(xdata[[j]])
# Create a vector for the selection variable
sel<-rep(1, nmeas)
# Mark data points as being smaller than a subsequent one
TEST <- xdata[[j]]$temperature
for (i in 1:(nmeas-1)) {
if (TEST[i]< max(TEST[(i+1):nmeas]))
{
sel[i]=0
}
}
#Set start values of Coefficients:
c1<- max(xdata[[j]]$temperature) - min(xdata[[j]]$temperature)
c2<-(max(xdata[[j]]$time)+min(xdata[[j]]$time))/2
c3<-c2/6
c4<-min(xdata[[j]]$temperature)
xdata[[j]]<-cbind(xdata[[j]],sel)
data2<-subset(xdata[[j]],xdata[[j]]$sel==1)
# Do the fitting:
res<- nls(data2$temperature~c4+c1*1/(1+exp(((data2$time-c2)/c3))), data= data2, start= list(c1=c1, c2=c2, c3=c3, c4=c4))
# Make result accessible
results<-as.vector(coef(res))
print(results)
}
## [1] 4.318488 66.780031 6.465146 31.932761
## [1] 7.831914 21.383520 7.015731 29.479575
## [1] 0.6620789 13.4957935 0.8161342 36.7756314
Sometimes memes don’t have proper grammar…. but we are so happy about doing NESTED for loops that we don’t care about proper grammar!
Using the equation above (and the coeficcients computed by the above code), the estimated rectal temperature of a given individual can be obtained by setting variable t to t=0.
Once the estimates for the rectal temperature were obtained using the fecal temperatures, the measured temperature (rectal) and estimated temperature (fecal) were compared. The authors predicted that fecal temperature would be an accurate estimate for rectal temperature.
First, I will load the measured and estimated temperatures:
meas_est<-read.table(file="HumanFecalSamples.txt",header=T, sep="\t")
attach(meas_est)
meas_est
## Measured Estimated
## 1 36.8 36.83478
## 2 37.3 37.24003
## 3 36.9 37.07576
## 4 36.8 36.83302
## 5 36.9 37.10025
## 6 37.0 37.02749
## 7 37.1 36.25157
## 8 37.5 37.38725
## 9 36.9 37.44701
## 10 37.0 37.19265
## 11 36.9 37.19209
## 12 36.7 36.88094
## 13 37.0 36.88861
## 14 37.1 35.44987
## 15 37.0 37.03043
## 16 37.4 37.22846
## 17 37.5 37.59391
## 18 36.9 37.16528
## 19 36.7 37.05113
## 20 36.9 36.71054
## 21 37.2 37.34197
## 22 37.6 37.51385
## 23 37.5 37.57481
## 24 36.9 36.96437
## 25 37.0 36.72163
## 26 37.3 37.24190
## 27 36.7 36.70102
## 28 37.1 36.81899
## 29 37.7 37.81014
The authors conducted a Spearman’s Rank Correlation Test on the data:
So, now I will run a Spearman’s Rank Correlation comparing measured and estimated temperatures. This test will establish whether there is a relationship between the measured and estimated temperatures.
Spear<- cor.test(Measured, Estimated, method= "spearman")
## Warning in cor.test.default(Measured, Estimated, method = "spearman"):
## Cannot compute exact p-value with ties
Spear
##
## Spearman's rank correlation rho
##
## data: Measured and Estimated
## S = 1717.9, p-value = 0.001053
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
## rho
## 0.5768731
While not receiving the identical value of rho, my value is roughly comparable (off by 0.05) and the p-value remains less than or equal to 0.001, which was reported in the study. AND this means that measured and estimated temperatures are indeed related to one another!
Next, the authors reported, “The average (arithmetic mean) absolute deviation between estimates and actual rectal temperatures was 0.22 C (n529)” (543). To calculate this deviation, I will calculate the difference between the measured and estimated temperatures. Next, I will take the absolute value of these differences. From the absolute difference, I will take the average:
for(i in 1:length(meas_est$Measured)){
meas_est$Difference[i]<- meas_est$Estimated[i]-meas_est$Measured[i]}
meas_est$Absolute<- abs(meas_est$Difference)
meas_est
## Measured Estimated Difference Absolute
## 1 36.8 36.83478 0.03478216 0.03478216
## 2 37.3 37.24003 -0.05997464 0.05997464
## 3 36.9 37.07576 0.17576484 0.17576484
## 4 36.8 36.83302 0.03302034 0.03302034
## 5 36.9 37.10025 0.20025284 0.20025284
## 6 37.0 37.02749 0.02749091 0.02749091
## 7 37.1 36.25157 -0.84843283 0.84843283
## 8 37.5 37.38725 -0.11274773 0.11274773
## 9 36.9 37.44701 0.54700868 0.54700868
## 10 37.0 37.19265 0.19264954 0.19264954
## 11 36.9 37.19209 0.29209113 0.29209113
## 12 36.7 36.88094 0.18093542 0.18093542
## 13 37.0 36.88861 -0.11138645 0.11138645
## 14 37.1 35.44987 -1.65012626 1.65012626
## 15 37.0 37.03043 0.03043348 0.03043348
## 16 37.4 37.22846 -0.17153916 0.17153916
## 17 37.5 37.59391 0.09390821 0.09390821
## 18 36.9 37.16528 0.26527801 0.26527801
## 19 36.7 37.05113 0.35113319 0.35113319
## 20 36.9 36.71054 -0.18946276 0.18946276
## 21 37.2 37.34197 0.14197062 0.14197062
## 22 37.6 37.51385 -0.08614615 0.08614615
## 23 37.5 37.57481 0.07480759 0.07480759
## 24 36.9 36.96437 0.06436846 0.06436846
## 25 37.0 36.72163 -0.27837373 0.27837373
## 26 37.3 37.24190 -0.05810086 0.05810086
## 27 36.7 36.70102 0.00102466 0.00102466
## 28 37.1 36.81899 -0.28101294 0.28101294
## 29 37.7 37.81014 0.11013738 0.11013738
averageAbsDiff <- mean(meas_est$Absolute)
averageAbsDiff
## [1] 0.2298056
Huzah it matches!
Next the authors reported, “Fifteen of the estimates differed less than 0.12 C from the actual value, whereas only three differed by more than 0.5 C from the rectal temperature” (543). We can easily count how many of the results are below 0.12 C or greater than 0.5 C.
countLesser <- 0
for(i in 1:length(meas_est$Measured)){
if(meas_est$Absolute[i] <= 0.12){
countLesser <- countLesser + 1
}
}
countLesser
## [1] 14
Not exactly correct…. and I can’t find anything wrong with my code (manually counted 14 samples below 0.12 C)… grr!
countGreater <- 0
for(i in 1:length(meas_est$Measured)){
if(meas_est$Absolute[i] > 0.5){
countGreater <- countGreater + 1
}
}
countGreater
## [1] 3
Perfection!
Unfortunately, conducting a Wilcoxon Test on their data with the wilcox.test() function does not yield the same results as published in Jensen et al. (2009):
Will <- wilcox.test(meas_est$Measured, meas_est$Estimated, paired=TRUE)
Will
##
## Wilcoxon signed rank test
##
## data: meas_est$Measured and meas_est$Estimated
## V = 189, p-value = 0.5504
## alternative hypothesis: true location shift is not equal to 0
Thus, I will do the Wilcoxon Signed Rank Test by hand, where I specifically calculate V/T for the overestimated temperatures (+):
for(i in 1:length(meas_est$Measured)){
meas_est$Sign[i]<- sign(meas_est$Difference[i])}
meas_est
## Measured Estimated Difference Absolute Sign
## 1 36.8 36.83478 0.03478216 0.03478216 1
## 2 37.3 37.24003 -0.05997464 0.05997464 -1
## 3 36.9 37.07576 0.17576484 0.17576484 1
## 4 36.8 36.83302 0.03302034 0.03302034 1
## 5 36.9 37.10025 0.20025284 0.20025284 1
## 6 37.0 37.02749 0.02749091 0.02749091 1
## 7 37.1 36.25157 -0.84843283 0.84843283 -1
## 8 37.5 37.38725 -0.11274773 0.11274773 -1
## 9 36.9 37.44701 0.54700868 0.54700868 1
## 10 37.0 37.19265 0.19264954 0.19264954 1
## 11 36.9 37.19209 0.29209113 0.29209113 1
## 12 36.7 36.88094 0.18093542 0.18093542 1
## 13 37.0 36.88861 -0.11138645 0.11138645 -1
## 14 37.1 35.44987 -1.65012626 1.65012626 -1
## 15 37.0 37.03043 0.03043348 0.03043348 1
## 16 37.4 37.22846 -0.17153916 0.17153916 -1
## 17 37.5 37.59391 0.09390821 0.09390821 1
## 18 36.9 37.16528 0.26527801 0.26527801 1
## 19 36.7 37.05113 0.35113319 0.35113319 1
## 20 36.9 36.71054 -0.18946276 0.18946276 -1
## 21 37.2 37.34197 0.14197062 0.14197062 1
## 22 37.6 37.51385 -0.08614615 0.08614615 -1
## 23 37.5 37.57481 0.07480759 0.07480759 1
## 24 36.9 36.96437 0.06436846 0.06436846 1
## 25 37.0 36.72163 -0.27837373 0.27837373 -1
## 26 37.3 37.24190 -0.05810086 0.05810086 -1
## 27 36.7 36.70102 0.00102466 0.00102466 1
## 28 37.1 36.81899 -0.28101294 0.28101294 -1
## 29 37.7 37.81014 0.11013738 0.11013738 1
for(i in 1:length(meas_est$Measured)){
meas_est$Rank<- rank(meas_est$Absolute)}
meas_est
## Measured Estimated Difference Absolute Sign Rank
## 1 36.8 36.83478 0.03478216 0.03478216 1 5
## 2 37.3 37.24003 -0.05997464 0.05997464 -1 7
## 3 36.9 37.07576 0.17576484 0.17576484 1 17
## 4 36.8 36.83302 0.03302034 0.03302034 1 4
## 5 36.9 37.10025 0.20025284 0.20025284 1 21
## 6 37.0 37.02749 0.02749091 0.02749091 1 2
## 7 37.1 36.25157 -0.84843283 0.84843283 -1 28
## 8 37.5 37.38725 -0.11274773 0.11274773 -1 14
## 9 36.9 37.44701 0.54700868 0.54700868 1 27
## 10 37.0 37.19265 0.19264954 0.19264954 1 20
## 11 36.9 37.19209 0.29209113 0.29209113 1 25
## 12 36.7 36.88094 0.18093542 0.18093542 1 18
## 13 37.0 36.88861 -0.11138645 0.11138645 -1 13
## 14 37.1 35.44987 -1.65012626 1.65012626 -1 29
## 15 37.0 37.03043 0.03043348 0.03043348 1 3
## 16 37.4 37.22846 -0.17153916 0.17153916 -1 16
## 17 37.5 37.59391 0.09390821 0.09390821 1 11
## 18 36.9 37.16528 0.26527801 0.26527801 1 22
## 19 36.7 37.05113 0.35113319 0.35113319 1 26
## 20 36.9 36.71054 -0.18946276 0.18946276 -1 19
## 21 37.2 37.34197 0.14197062 0.14197062 1 15
## 22 37.6 37.51385 -0.08614615 0.08614615 -1 10
## 23 37.5 37.57481 0.07480759 0.07480759 1 9
## 24 36.9 36.96437 0.06436846 0.06436846 1 8
## 25 37.0 36.72163 -0.27837373 0.27837373 -1 23
## 26 37.3 37.24190 -0.05810086 0.05810086 -1 6
## 27 36.7 36.70102 0.00102466 0.00102466 1 1
## 28 37.1 36.81899 -0.28101294 0.28101294 -1 24
## 29 37.7 37.81014 0.11013738 0.11013738 1 12
T_pos <- 0
T_neg <- 0
for(i in 1:length(meas_est$Measured)){
if(meas_est$Sign[i] > 0){
T_pos <- T_pos + meas_est$Rank[i]
}
else{
T_neg <- T_neg + meas_est$Rank[i]
}
}
T_pos
## [1] 246
Again, this is very close, but just slightly off from the published value. I predict that they likely switched the rank for two of their samples when doing this analysis by hand. The reason that this needed to be done by hand is that they were specifically looking for the estimated temperatures that were overestimates of actual temperature (when you use the wilcox.test() function in R, it gives you the V value for the estimate temperatures that underestimate actual temperature).
Enjoy reading about poop data (as I have been calling it)! And this dancing chimp gif!